Search results for "Data stream"

Showing 10 of 50 documents

Hyperspectral dimensionality reduction for biophysical variable statistical retrieval

2017

Current and upcoming airborne and spaceborne imaging spectrometers lead to vast hyperspectral data streams. This scenario calls for automated and optimized spectral dimensionality reduction techniques to enable fast and efficient hyperspectral data processing, such as inferring vegetation properties. In preparation for next-generation biophysical variable retrieval methods applicable to hyperspectral data, we present the evaluation of 11 dimensionality reduction (DR) methods in combination with advanced machine learning regression algorithms (MLRAs) for statistical variable retrieval. Two unique hyperspectral datasets were analyzed on the predictive power of DR + MLRA methods to ret…

Keywords: 010504 meteorology & atmospheric sciences; Mean squared error; Computer science; 0211 other engineering and technologies; 02 engineering and technology; computer.software_genre; 01 natural sciences; symbols.namesake; Linear regression; Computers in Earth Sciences; Engineering (miscellaneous); Gaussian process; HyMap; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Data stream mining; business.industry; Dimensionality reduction; Hyperspectral imaging; Pattern recognition; Atomic and Molecular Physics and Optics; Computer Science Applications; Kernel (statistics); symbols; Data mining; Artificial intelligence; business; computer
Venue: ISPRS Journal of Photogrammetry and Remote Sensing

Summarizing the state of the terrestrial biosphere in few dimensions

2020

In times of global change, we must closely monitor the state of the planet in order to understand the full complexity of these changes. In fact, each of the Earth's subsystems (i.e., the biosphere, atmosphere, hydrosphere, and cryosphere) can be analyzed from a multitude of data streams. However, since it is very hard to jointly interpret multiple monitoring data streams in parallel, one often aims for some summarizing indicator. Climate indices, for example, summarize the state of atmospheric circulation in a region. Although such approaches are also used in other fields of science, they are rarely used to describe land surface dynamics. Here, we propose a robust method to crea…

Keywords: 0106 biological sciences; 010504 meteorology & atmospheric sciences; Atmospheric circulation; lcsh:Life; 0207 environmental engineering; 02 engineering and technology; 010603 evolutionary biology; 01 natural sciences; lcsh:QH540-549.5; Cryosphere; 020701 environmental engineering; Ecology, Evolution, Behavior and Systematics; 0105 earth and related environmental sciences; Earth-Surface Processes; Data stream mining; lcsh:QE1-996.5; Biosphere; Global change; 15. Life on land; Albedo; lcsh:Geology; lcsh:QH501-531; Arctic; 13. Climate action; Climatology; Environmental science; lcsh:Ecology; Hydrosphere

A Methodology to Derive Global Maps of Leaf Traits Using Remote Sensing and Climate Data

2018

This paper introduces a modular processing chain to derive global high-resolution maps of leaf traits. In particular, we present global maps at 500 m resolution of specific leaf area, leaf dry matter content, leaf nitrogen and phosphorus content per dry mass, and leaf nitrogen/phosphorus ratio. The processing chain exploits machine learning techniques along with optical remote sensing data (MODIS/Landsat) and climate data for gap filling and up-scaling of in-situ measured leaf traits. The chain first uses random forests regression with surrogates to fill gaps in the database (> 45% of missing entries) and maximizes the global representativeness of the trait dataset. Plant species are then a…

Keywords: 0106 biological sciences; FOS: Computer and information sciences; 010504 meteorology & atmospheric sciences; Specific leaf area; Climate; Forest and Landscape Ecology; Soil Science; FOS: Physical sciences; Applied Physics (physics.app-ph); 010603 evolutionary biology; 01 natural sciences; Statistics - Applications; Goodness of fit; Abundance (ecology); Machine learning; Applications (stat.AP); Computers in Earth Sciences; Plant ecology; Vegetation; 0105 earth and related environmental sciences; Remote sensing; Mathematics; 2. Zero hunger; Plant traits; Data stream mining; Landsat; MODIS; Random forests; Geology; Global Map; Regression analysis; Physics - Applied Physics; 15. Life on land; PE&RC; Random forest; Trait; Vegetation, Forest and Landscape Ecology

Towards Quantifying Non-Photosynthetic Vegetation for Agriculture Using Spaceborne Imaging Spectroscopy

2021

Non-photosynthetic vegetation (NPV) has been identified as a priority variable in the context of new spaceborne imaging spectroscopy missions. In this study, we provide a first attempt to quantify NPV biomass from the unprecedented data streams to be provided by multiple recently launched or planned instruments. A hybrid workflow is proposed that combines Gaussian process regression (GPR) trained on radiative transfer model (RTM) simulations with active learning strategies. A soybean field data set including two dates with NPV measurements on yellow and senescent (brown) plant organs was used for model validation, resulting in relative errors of 13.4%. This prototype retrieval model wa…

Keywords: 2. Zero hunger; 010504 meteorology & atmospheric sciences; Data stream mining; 0211 other engineering and technologies; EnMAP; Hyperspectral imaging; Context (language use); PRISMA; 02 engineering and technology; Vegetation; Vegetation functional trait; 01 natural sciences; Lignin; Imaging spectroscopy; Atmospheric radiative transfer codes; Workflow; Hybrid approach; CHIME; Kriging; Environmental science; Cellulose; Gaussian process regression; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Remote sensing

Earth system data cubes unravel global multivariate dynamics

2020

Understanding Earth system dynamics in light of ongoing human intervention and dependency remains a major scientific challenge. The unprecedented availability of data streams describing different facets of the Earth now offers fundamentally new avenues to address this quest. However, several practical hurdles, especially the lack of data interoperability, limit the joint potential of these data streams. Today, many initiatives within and beyond the Earth system sciences are exploring new approaches to overcome these hurdles and meet the growing interdisciplinary need for data-intensive research; using data cubes is one promising avenue. Here, we introduce the concept of Earth system data cu…

Keywords: Agriculture and Food Sciences; DECOMPOSITION; 0106 biological sciences; FLUXES; Dependency (UML); lcsh:Dynamic and structural geology; 010504 meteorology & atmospheric sciences; Interface (Java); Computer science; DIMENSIONALITY; 010603 evolutionary biology; 01 natural sciences; ESA; Data cube; 03 medical and health sciences; lcsh:QE500-639.5; TEMPERATURE SENSITIVITY; lcsh:Science; 030304 developmental biology; 0105 earth and related environmental sciences; 0303 health sciences; Data stream mining; lcsh:QE1-996.5; SCIENCE; FRAMEWORK; Data science; PRODUCTS; lcsh:Geology; MODEL; Earth system science; Variable (computer science); Workflow; 13. Climate action; General Earth and Planetary Sciences; lcsh:Q; SOIL RESPIRATION; Curse of dimensionality

On the Classification of Dynamical Data Streams Using Novel “Anti–Bayesian” Techniques

2018

The classification of dynamical data streams is among the most complex problems encountered in classification. This is, firstly, because the distribution of the data streams is non-stationary, and it changes without any prior “warning”. Secondly, the manner in which it changes is also unknown. Thirdly, and more interestingly, the model operates with the assumption that the correct classes of previously-classified patterns become available at a juncture after their appearance. This paper pioneers the use of unreported novel schemes that can classify such dynamical data streams by invoking the recently-introduced “Anti-Bayesian” (AB) techniques. Contrary to the Bayesian paradigm, that compar…

Keywords: Anti-Bayesian classification; Data streams

Sequential Mining Classification

2017

Sequential pattern mining is a data mining technique that aims to extract and analyze frequent subsequences from sequences of events or items under time constraints. Sequence data mining was introduced in 1995 with the well-known Apriori algorithm. The algorithm studied transactions over time in order to extract frequent patterns from the sequences of products related to a customer. Later, this technique became useful in many applications: DNA research, medical diagnosis and prevention, telecommunications, etc. GSP, SPAM, SPADE, PrefixSpan and other advanced algorithms followed. In view of the evolution of data mining techniques based on sequential data, this paper discusses the multiple …

Keywords: Apriori algorithm; Computer science; business.industry; Data stream mining; Concept mining; 02 engineering and technology; computer.software_genre; Machine learning; GSP Algorithm; Tree (data structure); Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; business; K-optimal pattern discovery; computer; FSA-Red Algorithm
Venue: 2017 International Conference on Computer and Applications (ICCA)

Modeling Multi-label Recurrence in Data Streams

2019

Most of the existing data stream algorithms assume a single label as the target variable. However, in many applications, each observation is assigned to several labels with latent dependencies among them, and the target function may change over time. Classification of such non-stationary multi-label streaming data, taking into account dependencies among labels and potential drifts, is a challenging task. The few existing studies mostly cope with drifts implicitly, and all learn models on the original label space, which requires a lot of time and memory. None of them consider recurrent drifts in multi-label streams and particularly drifts and recurrences visible in a latent label spa…

Keywords: Change over time; Multi-label classification; Data stream; business.industry; Computer science; Data stream mining; Space dimension; Pattern recognition; ComputingMethodologies_PATTERNRECOGNITION; Streaming data; Artificial intelligence; business; Classifier (UML); Decoding methods
Venue: 2019 IEEE International Conference on Big Knowledge (ICBK)

Structural clustering of millions of molecular graphs

2014

We propose an algorithm for clustering very large molecular graph databases according to scaffolds (i.e., large structural overlaps) that are common between cluster members. Our approach first partitions the original dataset into several smaller datasets using a greedy clustering approach named APreClus based on dynamic seed clustering. APreClus is an online, instance-incremental clustering algorithm that delays the final cluster assignment of an instance until one of the so-called pending clusters the instance belongs to has reached significant size and is converted to a fixed cluster. Once a cluster is fixed, APreClus recalculates the cluster centers, which are used as representatives for…

Keywords: Clustering high-dimensional data; Fuzzy clustering; Theoretical computer science; k-medoids; Computer science; Single-linkage clustering; Correlation clustering; Constrained clustering; computer.software_genre; Complete-linkage clustering; Graph; Hierarchical clustering; ComputingMethodologies_PATTERNRECOGNITION; Data stream clustering; CURE data clustering algorithm; Canopy clustering algorithm; FLAME clustering; Affinity propagation; Data mining; Cluster analysis; computer; k-medians clustering; Clustering coefficient
Venue: Proceedings of the 29th Annual ACM Symposium on Applied Computing

On the Online Classification of Data Streams Using Weak Estimators

2016

In this paper, we propose a novel online classifier for complex data streams whose underlying stochastic properties are non-stationary. Instead of using a single training model and counters to keep important data statistics, the introduced online classifier scheme provides a real-time self-adjusting learning model. The learning model utilizes the multiplication-based update algorithm of the Stochastic Learning Weak Estimator (SLWE) at each time instant as a new labeled instance arrives. In this way, the data statistics are updated every time a new element is inserted, without requiring the model to be rebuilt when changes occur in the data distributions. Finally, and most impo…

Keywords: Complex data type; Training set; Learning automata; Computer science; business.industry; Data stream mining; Estimator; 020206 networking & telecommunications; 02 engineering and technology; computer.software_genre; Machine learning; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Data mining; Artificial intelligence; business; computer; Classifier (UML); Juncture
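
Illustrative note on the last entry: the abstract mentions a multiplication-based SLWE update but does not spell it out. The sketch below assumes the multinomial weak-estimator rule as it is commonly stated (every non-observed estimate is multiplied by a forgetting factor, and the observed symbol absorbs the released mass). The names `slwe_update`, `track_stream`, and `lam` are hypothetical, and this is not the paper's classifier, only a minimal picture of how such an estimator can follow a drifting distribution.

```python
from typing import Iterable, List


def slwe_update(probs: List[float], observed: int, lam: float = 0.9) -> List[float]:
    """One multiplicative update step (assumed SLWE-style rule, not taken from the abstract)."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    if not 0 <= observed < len(probs):
        raise IndexError("observed symbol index out of range")
    # Shrink every estimate by the forgetting factor lam.
    updated = [lam * p for p in probs]
    # The observed symbol keeps its old estimate and gains the mass
    # released by all the other symbols, so the total stays unchanged.
    others = sum(p for j, p in enumerate(probs) if j != observed)
    updated[observed] = probs[observed] + (1.0 - lam) * others
    return updated


def track_stream(symbols: Iterable[int], num_symbols: int, lam: float = 0.9) -> List[float]:
    """Run the update over a stream, starting from a uniform estimate."""
    probs = [1.0 / num_symbols] * num_symbols
    for s in symbols:
        probs = slwe_update(probs, s, lam)
    return probs
```

A smaller `lam` forgets old observations faster and so reacts more quickly to drift, at the cost of noisier estimates; the value 0.9 here is an arbitrary choice for illustration.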